Empowering Children to Create AI-Enabled Augmented Reality Experiences

Zhang, Lei, Zhou, Shuyao, Liaqat, Amna, Mak, Tinney, Berengard, Brian, Qian, Emily, Monroy-Hernández, Andrés

arXiv.org Artificial Intelligence

Despite their potential to enhance children's learning experiences, AI-enabled AR technologies are predominantly used in ways that position children as consumers rather than creators. We introduce Capybara, an AR-based and AI-powered visual programming environment that empowers children to create, customize, and program 3D characters overlaid onto the physical world. Capybara enables children to create virtual characters and accessories using text-to-3D generative AI models, and to animate these characters through auto-rigging and body tracking. In addition, our system employs vision-based AI models to recognize physical objects, allowing children to program interactive behaviors between virtual characters and their physical surroundings. We demonstrate the expressiveness of Capybara through a set of novel AR experiences. We conducted user studies with 20 children in the United States and Argentina. Our findings suggest that Capybara can empower children to harness AI in authoring personalized and engaging AR experiences that seamlessly bridge the virtual and physical worlds.
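
To make the object-triggered behavior programming concrete, here is a minimal sketch of how recognized physical objects might be mapped to character animations. The Character and BehaviorRules classes, the rule names, and the hard-coded detection labels are all illustrative assumptions, not Capybara's actual API.

```python
# Hypothetical sketch: wiring a vision-based object detector to character
# behaviors, in the spirit of Capybara's physical-object triggers.
from dataclasses import dataclass, field

@dataclass
class Character:
    name: str

    def play(self, animation: str) -> None:
        print(f"{self.name} plays '{animation}'")

@dataclass
class BehaviorRules:
    # Maps a recognized physical-object label to a character animation.
    rules: dict[str, str] = field(default_factory=dict)

    def when_sees(self, label: str, animation: str) -> None:
        self.rules[label] = animation

    def on_detection(self, character: Character, labels: list[str]) -> None:
        for label in labels:
            if label in self.rules:
                character.play(self.rules[label])

capy = Character("Capybara")
rules = BehaviorRules()
rules.when_sees("cup", "drink")    # a child-authored rule
rules.when_sees("plant", "sniff")

# Labels would come from a real-time vision model; hard-coded here.
rules.on_detection(capy, ["plant", "keyboard"])  # -> Capybara plays 'sniff'
```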


Reality Proxy: Fluid Interactions with Real-World Objects in MR via Abstract Representations

Liu, Xiaoan, Jia, Difan, Liu, Xianhao Carton, Gonzalez-Franco, Mar, Zhu-Tian, Chen

arXiv.org Artificial Intelligence

Interacting with real-world objects in Mixed Reality (MR) often proves difficult when they are crowded, distant, or partially occluded, hindering straightforward selection and manipulation. We observe that these difficulties stem from performing interaction directly on physical objects, where input is tightly coupled to their physical constraints. Our key insight is to decouple interaction from these constraints by introducing proxies: abstract representations of real-world objects. We embody this concept in Reality Proxy, a system that seamlessly shifts interaction targets from physical objects to their proxies during selection. Beyond facilitating basic selection, Reality Proxy uses AI to enrich proxies with semantic attributes and hierarchical spatial relationships of their corresponding physical objects, enabling novel and previously cumbersome interactions in MR, such as skimming, attribute-based filtering, navigating nested groups, and complex multi-object selections, all without requiring new gestures or menu systems. We demonstrate Reality Proxy's versatility across diverse scenarios, including office information retrieval, large-scale spatial navigation, and multi-drone control. An expert evaluation supports the system's utility and usability, suggesting that proxy-based abstractions offer a powerful and generalizable interaction paradigm for future MR systems.
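
As one way to picture the proxy abstraction, the sketch below models proxies as a small tree of labeled nodes carrying AI-derived attributes, so attribute-based filtering becomes a data query rather than a spatial gesture. The Proxy class and its fields are assumptions for illustration, not the paper's implementation.

```python
# Illustrative proxy abstraction: each physical object gets a lightweight
# proxy with semantic attributes and a parent/child hierarchy, so selection
# and filtering operate on data rather than on crowded physical targets.
from dataclasses import dataclass, field

@dataclass
class Proxy:
    label: str
    attributes: dict[str, str] = field(default_factory=dict)
    children: list["Proxy"] = field(default_factory=list)

    def filter(self, **wanted: str) -> list["Proxy"]:
        """Return self and all descendants matching every given attribute."""
        hits = []
        if all(self.attributes.get(k) == v for k, v in wanted.items()):
            hits.append(self)
        for child in self.children:
            hits.extend(child.filter(**wanted))
        return hits

desk = Proxy("desk", children=[
    Proxy("mug", {"color": "red", "state": "full"}),
    Proxy("book", {"color": "red"}),
    Proxy("mug", {"color": "blue", "state": "empty"}),
])

# Attribute-based selection without pointing at any physical object:
red_items = desk.filter(color="red")
print([p.label for p in red_items])  # -> ['mug', 'book']
```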


Exploring the Innovation Opportunities for Pre-trained Models

Park, Minjung, Forlizzi, Jodi, Zimmerman, John

arXiv.org Artificial Intelligence

Innovators transform the world by understanding where services are successfully meeting customers' needs and then using this knowledge to identify failsafe opportunities for innovation. Pre-trained models have changed the AI innovation landscape, making it faster and easier to create new AI products and services. Understanding where pre-trained models are successful is critical for supporting AI innovation. Unfortunately, the hype cycle surrounding pre-trained models makes it hard to know where AI can really be successful. To address this, we investigated pre-trained model applications developed by HCI researchers as a proxy for commercially successful applications. The research applications demonstrate technical capabilities, address real user needs, and avoid ethical challenges. Using an artifact analysis approach, we categorized capabilities, opportunity domains, data types, and emerging interaction design patterns, uncovering some of the opportunity space for innovation with pre-trained models.


Augmenting Human Cognition through Everyday AR

Liu, Xiaoan

arXiv.org Artificial Intelligence

As spatial computing and multimodal LLMs mature, AR is poised to become an intuitive "thinking tool" that embeds semantic, context-aware intelligence directly into everyday environments. This paper explores how always-on AR can seamlessly bridge digital cognition and physical affordances, enabling proactive, context-sensitive interactions that enhance human task performance and understanding.


Peek into the "White-Box": A Field Study on Bystander Engagement with Urban Robot Uncertainty

Yu, Xinyan, Hoggenmueller, Marius, Tran, Tram Thi Minh, Wang, Yiyuan, Zhang, Qiuming, Tomitsch, Martin

arXiv.org Artificial Intelligence

Uncertainty inherently exists in the autonomous decision-making process of robots. Involving humans in resolving this uncertainty not only helps robots mitigate it but is also crucial for improving human-robot interactions. However, in public urban spaces filled with unpredictability, robots often face heightened uncertainty without direct human collaborators. This study investigates how robots can engage bystanders for assistance in public spaces when encountering uncertainty, and examines how these interactions affect bystanders' perceptions of and attitudes towards robots. We designed and tested a speculative "peephole" concept that engages bystanders in resolving urban robot uncertainty. Our design is guided by considerations of non-intrusiveness and of eliciting initiative implicitly, reflecting bystanders' unique role as non-obligated participants in relation to urban robots. Drawing on our field study findings, we highlight the potential of involving bystanders in mitigating urban robots' technological imperfections, both to address operational challenges and to foster public acceptance of urban robots. Furthermore, we offer design implications for encouraging bystanders' involvement in mitigating these imperfections.


Co-Designing Augmented Reality Tools for High-Stakes Clinical Teamwork

Taylor, Angelique, Tanjim, Tauhid, Cao, Huajie, Nicoly, Jalynn Blu, Segal, Jonathan I., George, Jonathan St., Kim, Soyon, Ching, Kevin, Ortega, Francisco R., Lee, Hee Rin

arXiv.org Artificial Intelligence

How might healthcare workers (HCWs) leverage augmented reality head-mounted displays (AR-HMDs) to enhance teamwork? Although AR-HMDs have shown immense promise in supporting teamwork in healthcare settings, design for Emergency Room (ER) teams has received little attention. The ER presents unique challenges, including procedural recall, medical errors, and communication gaps. To address this gap, we engaged in a participatory design study with healthcare workers to gain a deep understanding of the potential for AR-HMDs to facilitate teamwork during ER procedures. Our results reveal that AR-HMDs can serve as an information-sharing and information-retrieval system that bridges knowledge gaps, but they also surface concerns about integrating AR-HMDs into ER workflows. We contribute design recommendations for seven role-based AR-HMD application scenarios involving HCWs with varied expertise working across multiple medical tasks. We hope our research inspires designers to develop new AR-HMD applications for high-stakes team environments.


Toyteller: AI-powered Visual Storytelling Through Toy-Playing with Character Symbols

Chung, John Joon Young, Roemmele, Melissa, Kreminski, Max

arXiv.org Artificial Intelligence

We introduce Toyteller, an AI-powered storytelling system in which users generate a mix of story text and visuals by directly manipulating character symbols, as if playing with toys. Anthropomorphized symbol motions can convey rich and nuanced social interactions; Toyteller leverages these motions (1) to let users steer story text generation and (2) as a visual output format that accompanies story text. We enable motion-steered text generation and text-steered motion generation by mapping motions and text onto a shared semantic space, which large language models and motion generation models can use as a translational layer. Technical evaluations showed that Toyteller outperforms a competitive baseline, GPT-4o. Our user study found that toy-playing helps express intentions that are difficult to verbalize. However, motion alone could not express all user intentions, suggesting that it be combined with other modalities such as language. We discuss the design space of toy-playing interactions and implications for technical HCI research on human-AI interaction.
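
The shared semantic space can be illustrated with a toy retrieval loop: text and motion clips are embedded into one vector space, and a story beat selects the nearest motion. The encoders below are deterministic random stand-ins (so the actual match here is arbitrary); a real system would use trained, aligned encoders as the paper describes.

```python
# Sketch of a shared semantic space as a translational layer between
# text and motion. The embed() function is a placeholder encoder.
import hashlib
import numpy as np

def embed(phrase: str) -> np.ndarray:
    # Stand-in encoder: deterministic random unit vectors. A real system
    # would use learned text/motion encoders trained to share this space.
    seed = int.from_bytes(hashlib.sha256(phrase.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).normal(size=64)
    return v / np.linalg.norm(v)

# A small library of motion clips, pre-embedded into the shared space.
motion_library = {name: embed(name) for name in
                  ["approach slowly", "back away", "circle excitedly"]}

def text_to_motion(story_beat: str) -> str:
    """Pick the motion whose embedding best matches the text (dot product)."""
    query = embed(story_beat)
    return max(motion_library, key=lambda m: float(motion_library[m] @ query))

print(text_to_motion("the fox cautiously approaches"))  # arbitrary with stand-ins
```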


RealitySummary: On-Demand Mixed Reality Document Enhancement using Large Language Models

Gunturu, Aditya, Jadon, Shivesh, Zhang, Nandi, Thundathil, Jarin, Willett, Wesley, Suzuki, Ryo

arXiv.org Artificial Intelligence

We introduce RealitySummary, a mixed reality reading assistant that can enhance any printed or digital document using on-demand text extraction, summarization, and augmentation. While augmented reading tools promise to enhance physical reading experiences with overlaid digital content, prior systems have typically required pre-processed documents, which limits their generalizability and real-world use cases. In this paper, we explore on-demand document augmentation by leveraging large language models. To understand generalizable techniques for diverse documents, we first conducted an exploratory design study, which identified five categories of document enhancements (summarization, augmentation, navigation, comparison, and extraction). Based on this, we developed a proof-of-concept system that can automatically extract and summarize text using Google Cloud OCR and GPT-4, then embed information around documents using a Microsoft HoloLens 2 and an Apple Vision Pro. We demonstrate real-time examples of six specific document augmentations: 1) summaries, 2) comparison tables, 3) timelines, 4) keyword lists, 5) summary highlighting, and 6) information cards. Results from a usability study (N=12) and an in-the-wild study (N=11) highlight the potential benefits of on-demand MR document enhancement and opportunities for future research.
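
The extract-then-summarize pipeline can be sketched in a few lines. The OCR step is stubbed out here (the paper uses Google Cloud OCR), and the model name and prompt in the GPT-4 call are assumptions rather than the authors' configuration.

```python
# Minimal sketch of an on-demand extract-then-summarize pipeline:
# OCR the camera frame, summarize with an LLM, render as an MR overlay.
from openai import OpenAI

def ocr_extract(image_bytes: bytes) -> str:
    # Stand-in for Google Cloud Vision text detection; a real system would
    # send the headset's camera frame to the OCR service.
    return "Photosynthesis converts light energy into chemical energy..."

def summarize(document_text: str) -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4",  # assumed model name, not the authors' configuration
        messages=[
            {"role": "system",
             "content": "Summarize the document in three bullet points."},
            {"role": "user", "content": document_text},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    text = ocr_extract(b"<camera frame>")
    print(summarize(text))  # would be rendered as an overlay panel in MR
```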


Augmented Physics: A Machine Learning-Powered Tool for Creating Interactive Physics Simulations from Static Diagrams

Gunturu, Aditya, Wen, Yi, Thundathil, Jarin, Zhang, Nandi, Kazi, Rubaiat Habib, Suzuki, Ryo

arXiv.org Artificial Intelligence

We introduce Augmented Physics, a machine learning-powered tool designed for creating interactive physics simulations from static textbook diagrams. Leveraging computer vision techniques, such as Segment Anything and OpenCV, our web-based system enables users to semi-automatically extract diagrams from physics textbooks and then generate interactive simulations based on the extracted content. These interactive diagrams are seamlessly integrated into scanned textbook pages, facilitating interactive and personalized learning experiences across various physics concepts, including gravity, optics, circuits, and kinematics. Drawing on an elicitation study with seven physics instructors, we explore four key augmentation techniques: 1) augmented experiments, 2) animated diagrams, 3) bi-directional manipulatives, and 4) parameter visualization. We evaluate our system through a technical evaluation, a usability study (N=12), and expert interviews (N=12). The study findings suggest that our system can facilitate more engaging and personalized learning experiences in physics education.
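
Below is a minimal sketch of the semi-automatic extraction step, assuming Segment Anything's public SamPredictor API plus OpenCV: a single user click prompts a mask for a diagram element, which is then cropped for reuse in a simulation. The checkpoint path, image file, and click coordinates are placeholders.

```python
# Click-prompted diagram extraction with Segment Anything and OpenCV.
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# SAM expects an RGB image; OpenCV loads BGR.
image = cv2.cvtColor(cv2.imread("textbook_page.png"), cv2.COLOR_BGR2RGB)

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")  # placeholder path
predictor = SamPredictor(sam)
predictor.set_image(image)

# One foreground click on the diagram element (e.g., a pendulum bob).
masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),  # placeholder click position
    point_labels=np.array([1]),           # 1 = foreground point
    multimask_output=True,
)
best = masks[int(np.argmax(scores))]

# Crop the masked element so it can be re-rendered as an interactive sprite.
x, y, w, h = cv2.boundingRect(best.astype(np.uint8))
crop = image[y:y + h, x:x + w]
cv2.imwrite("element.png", cv2.cvtColor(crop, cv2.COLOR_RGB2BGR))
```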